Linked servers enable SQL Server–based applications to include almost
any other type of data source in the execution of a SQL statement,
including direct references to remote SQL Server instances. They also
make it possible to issue distributed queries, updates, deletes,
inserts, commands, and full transactions against heterogeneous data
sources across your entire company network. SQL Server essentially acts
as the master query manager. Through OLE DB providers and OLE DB data
sources, any compliant data source can be referenced from any valid SQL
statement or command. Each data source is either referenced directly,
or SQL Server creates provider-specific subqueries and issues them to a
specialized provider. This comes very close to a federated data
management capability across most heterogeneous data sources.
Unlike remote servers, linked servers have two simple setup steps:
1. Define the remote server on the local server.
2. Define the method for mapping remote logins on the local server.
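These two steps correspond to the system stored procedures sp_addlinkedserver and sp_addlinkedsrvlogin. The following is a minimal sketch, not a definitive setup: the linked server name REMOTE_SQL, the data source remotehost\instance1, and the logins local_app_user and remote_reader are hypothetical placeholders that do not come from this chapter.

```sql
-- Step 1: define the remote server on the local server.
-- REMOTE_SQL and remotehost\instance1 are hypothetical names.
EXEC sp_addlinkedserver
    @server     = N'REMOTE_SQL',
    @srvproduct = N'',
    @provider   = N'SQLNCLI',
    @datasrc    = N'remotehost\instance1';

-- Step 2: define how a local login maps to a remote login.
-- local_app_user and remote_reader are hypothetical logins.
EXEC sp_addlinkedsrvlogin
    @rmtsrvname  = N'REMOTE_SQL',
    @useself     = N'FALSE',
    @locallogin  = N'local_app_user',
    @rmtuser     = N'remote_reader',
    @rmtpassword = N'<remote password>';
```

Both calls run on the local server only, and nothing is executed on the remote side.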
All linked server
configurations are performed on the local server. The mapping for the
local user to the remote user is stored in the local SQL Server
database. In fact, you don’t need to configure anything in the remote
database. Using linked servers also allows SQL Server to use OLE DB to
link to many data sources other than just SQL Server.
OLE DB is an API that allows
COM/.NET applications to work with databases as well as other data
sources, such as text files and spreadsheets. This capability gives SQL
Server access to many different types of data as if
these other data sources were local SQL Server tables or views, so a
single query can combine them with native SQL Server data.
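As an illustration of a non-SQL Server data source, the sketch below links an Excel workbook through the Jet OLE DB provider and then queries a worksheet as if it were a table. The linked server name EXCEL_SALES, the file path, and the worksheet name Sheet1 are assumptions made for this example, and it assumes the Microsoft.Jet.OLEDB.4.0 provider is installed on the server (it is not available to 64-bit instances).

```sql
-- Define a linked server over an Excel workbook.
-- EXCEL_SALES and the file path are hypothetical.
EXEC sp_addlinkedserver
    @server     = N'EXCEL_SALES',
    @srvproduct = N'Jet 4.0',
    @provider   = N'Microsoft.Jet.OLEDB.4.0',
    @datasrc    = N'C:\Data\sales.xls',
    @provstr    = N'Excel 8.0';

-- Query the worksheet with a four-part name.
-- Catalog and schema are left empty for this provider.
SELECT *
FROM EXCEL_SALES...[Sheet1$];
```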
Unlike remote servers, which support only remote procedure calls, linked servers also allow distributed queries and transactions.
Keep
in mind that when you define linked servers, SQL Server keeps
these data resources linked in several ways. Most importantly, it keeps
the schema definitions linked. In other words, if the schema of a
remote table on a linked server changes, any server that has a link to
it also sees the change. Even when the
linked server’s schema comes from something such as Excel, if you
change the Excel spreadsheet in any way, that change is automatically
reflected back at the local SQL Server that has defined that Excel
spreadsheet as a linked server. This is significant from a metadata and
schema integrity point of view, and it is what is meant by “completely
linked.”
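One way to see the metadata that the local server currently obtains from a linked server is to use the catalog procedures sp_linkedservers, sp_tables_ex, and sp_columns_ex. This is only a sketch: REMOTE_SQL, customer_db, and customer are hypothetical names used for illustration.

```sql
-- List the linked servers defined on the local server.
EXEC sp_linkedservers;

-- List the tables exposed by a linked server (REMOTE_SQL is hypothetical).
EXEC sp_tables_ex
    @table_server = N'REMOTE_SQL';

-- Show the current column definitions of one remote table.
EXEC sp_columns_ex
    @table_server  = N'REMOTE_SQL',
    @table_name    = N'customer',
    @table_schema  = N'dbo',
    @table_catalog = N'customer_db';
```

Running sp_columns_ex before and after a change to the remote table is a simple way to confirm that the change is visible locally.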
Distributed Queries
Distributed queries access data
stored in OLE DB data sources. SQL Server treats these data sources as
if they contained SQL Server tables. Through an OLE DB provider, the
data in each source is presented as recordsets (rowsets, in OLE DB
terms), which is the form in which SQL Server needs to see any data.
The Microsoft SQL Native Client OLE DB provider (with PROGID SQLNCLI)
is the official OLE DB provider for SQL Server 2008. You can view or
manipulate data through this provider by using the same basic Data
Manipulation Language (DML) syntax that T-SQL uses for local data (SELECT, INSERT, UPDATE, or DELETE
statements). The main difference is the table-naming convention.
Distributed queries use a four-part table name syntax for each data
source, as follows:
linked_server_name.catalog.schema.object_name
The following distributed
query accesses data from a sales table in an Oracle database, a region
table in a Microsoft Access database, and a customer table in a SQL
Server database, all with a single SQL statement:
SELECT s.sales_amount
FROM access_server...region AS r
JOIN oracle_server..sales_owner.sale AS s
    ON r.region_id = s.region_id
JOIN sql_server.customer_db.dbo.customer AS c
    ON s.customer_id = c.customer_id
WHERE r.region_name = 'Southwest'
  AND c.customer_name = 'ABC Steel';
All these data sources are
on completely different physical machines. But with linked servers and
distributed queries, you might not ever realize this.
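Four-part names let SQL Server decide how much of the work to send to each provider. When you would rather hand the remote source a complete query in its own dialect, the OPENQUERY function offers an alternative. The sketch below reuses the oracle_server linked server and the sales_owner.sale table from the query above; the pass-through text is an assumption about what the Oracle side would accept.

```sql
-- Send the quoted query to the linked server as-is and
-- treat the returned rows as a local table source.
SELECT s.sales_amount
FROM OPENQUERY(
         oracle_server,
         'SELECT sales_amount, region_id FROM sales_owner.sale'
     ) AS s;
```

The string inside OPENQUERY is not parsed by SQL Server, so it must be valid for the remote data source.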
Distributed Transactions
With SQL
Server distributed transactions, you can manipulate data
from several different data sources in a single transaction.
Distributed transactions are supported if the OLE DB provider has
built-in XA transactional functionality. For example, suppose two banks
decide to merge. The first bank (let’s call it OraBank) stores all
checking and savings accounts in an Oracle database. The second bank
(let’s call it SqlBank) stores all checking and savings accounts in a
SQL Server 2008 database. A customer has a savings account with
OraBank and a checking account with SqlBank. What would happen if the
customer wanted to transfer $100 from the savings account to the
checking account? You can handle this task by using the following code
while maintaining transactional consistency:
BEGIN DISTRIBUTED TRANSACTION;
-- One hundred dollars is subtracted from the savings account.
UPDATE oracle_server..savings_owner.savings_table
SET account_balance = account_balance - 100
WHERE account_number = 12345;
-- One hundred dollars is added to the checking account.
UPDATE sql_server.checking_db.dbo.checking_table
SET account_balance = account_balance + 100
WHERE account_number = 98765;
COMMIT TRANSACTION;
The transaction is either committed or rolled back on both databases.
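A more defensive variant of the same transfer is sketched below. It reuses the two UPDATE statements above and adds SET XACT_ABORT ON, which most OLE DB providers require for data modifications inside a transaction, plus a TRY...CATCH block that rolls back explicitly on failure. It assumes the Microsoft Distributed Transaction Coordinator (MS DTC) service is running on the participating servers. The error-handling layout is one reasonable choice, not the only one.

```sql
-- Abort and roll back the whole transaction on any run-time error.
SET XACT_ABORT ON;

BEGIN TRY
    BEGIN DISTRIBUTED TRANSACTION;

    -- Subtract one hundred dollars from the savings account.
    UPDATE oracle_server..savings_owner.savings_table
    SET account_balance = account_balance - 100
    WHERE account_number = 12345;

    -- Add one hundred dollars to the checking account.
    UPDATE sql_server.checking_db.dbo.checking_table
    SET account_balance = account_balance + 100
    WHERE account_number = 98765;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Roll back if a transaction is still open, then report the error.
    IF XACT_STATE() <> 0
        ROLLBACK TRANSACTION;

    DECLARE @msg nvarchar(2048) = ERROR_MESSAGE();
    RAISERROR(@msg, 16, 1);
END CATCH;
```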